26 research outputs found

    Kernel Graph Convolutional Neural Networks

    Graph kernels have been successfully applied to many graph classification problems. Typically, a kernel is first designed, and an SVM classifier is then trained on the features defined implicitly by this kernel. This two-stage approach decouples data representation from learning, which is suboptimal. Convolutional Neural Networks (CNNs), on the other hand, can learn their own features directly from the raw data during training; unfortunately, they cannot handle irregular data such as graphs. We address this challenge by using graph kernels to embed meaningful local neighborhoods of the graphs in a continuous vector space. A set of filters is then convolved with these patches and pooled, and the output is passed to a feedforward network. With limited parameter tuning, our approach outperforms strong baselines on 7 out of 10 benchmark datasets. Comment: Accepted at ICANN '1
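The pipeline the abstract describes (embed node neighborhoods, convolve filters over the resulting patches, pool, feed forward) can be sketched as follows. This is an illustrative toy, not the authors' implementation: the neighborhood embedding here is a simple degree histogram standing in for a proper graph kernel, and the weights are untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def neighborhood_patches(adj, radius=1, dim=4):
    """One vector per node: a histogram of degrees within its neighborhood."""
    n = adj.shape[0]
    # Nodes reachable within `radius` hops (including the node itself).
    reach = np.linalg.matrix_power(adj + np.eye(n, dtype=int), radius) > 0
    degrees = adj.sum(axis=1)
    patches = np.zeros((n, dim))
    for i in range(n):
        hist, _ = np.histogram(degrees[reach[i]], bins=dim, range=(0, dim))
        patches[i] = hist
    return patches

def classify(adj, W_filters, W_out):
    patches = neighborhood_patches(adj)             # (n_nodes, dim)
    responses = np.maximum(patches @ W_filters, 0)  # filters + ReLU
    pooled = responses.max(axis=0)                  # max-pool over nodes
    logits = pooled @ W_out                         # feedforward readout
    return logits.argmax()

# Tiny example: a 4-node cycle graph with random (untrained) weights.
adj = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])
W_filters = rng.standard_normal((4, 8))
W_out = rng.standard_normal((8, 2))
print(classify(adj, W_filters, W_out))  # one of the two class labels
```

In practice the patch embedding would come from a graph kernel (e.g. shortest-path or Weisfeiler-Lehman features) and the filter and readout weights would be learned end-to-end.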

    Boosting Tricks for Word Mover's Distance

    Due to the COVID-19 pandemic, the physical meeting of ICANN 2020 was postponed; the event was rescheduled for September 2021 in Bratislava, Slovakia. Word embeddings have opened a new path for creating novel approaches to traditional problems in the natural language processing (NLP) domain. However, using word embeddings to compare text documents remains a relatively unexplored topic, with Word Mover's Distance (WMD) being the most prominent tool used so far. In this paper, we present a variety of tools that can further improve the computation of distances between documents based on WMD. We demonstrate that alternative stopword lists, cross document-topic comparison, deep contextualized word vectors, and convex metric learning constitute powerful tools that can boost WMD.
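To make the underlying distance concrete, here is a hedged sketch of the Relaxed Word Mover's Distance (RWMD), a cheap lower bound on WMD in which each word sends all its mass to the closest word in the other document. The tiny hand-made word vectors are purely illustrative, not a trained embedding, and this is not the paper's code.

```python
import numpy as np

def rwmd(doc_a, doc_b, vectors):
    """Relaxed Word Mover's Distance between two documents,
    given a word -> vector mapping and uniform word weights."""
    A = np.array([vectors[w] for w in doc_a])
    B = np.array([vectors[w] for w in doc_b])
    # Pairwise Euclidean distances between word vectors.
    cost = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    # Each word travels to its nearest counterpart, in both directions.
    d_ab = cost.min(axis=1).mean()
    d_ba = cost.min(axis=0).mean()
    return max(d_ab, d_ba)  # the tighter of the two one-sided bounds

# Toy 2-d "embeddings": royalty words cluster apart from fruit words.
vectors = {
    "king":  np.array([0.9, 0.1]),
    "queen": np.array([0.8, 0.2]),
    "apple": np.array([0.1, 0.9]),
    "fruit": np.array([0.2, 0.8]),
}
print(rwmd(["king", "queen"], ["queen", "king"], vectors))  # 0.0: same words
print(rwmd(["king", "queen"], ["apple", "fruit"], vectors))  # large: distant clusters
```

The full WMD replaces the nearest-neighbor shortcut with an optimal-transport problem over word frequencies; the boosting tricks in the paper (stopword choice, contextual vectors, metric learning) plug into the same cost matrix.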

    New Representations, Regularization and Distances for Text Classification

    Text has been the dominant way of storing data in computer systems and sending information around the Web. Extracting meaningful representations out of text has been a key element of modelling language in order to tackle NLP tasks like text classification. These representations can then form groups that one can use for supervised learning problems; more specifically, one can utilize these linguistic groups for regularization purposes. Finally, these structures can be of help in another important field: distance computation between text documents.

    The main goal of this thesis is to study the aforementioned problems: first, by examining new graph-based representations of text; next, by studying how groups of these representations can help regularization in machine learning models for text classification; and last, by dealing with sets and measuring distances between documents, utilizing our proposed linguistic groups as well as graph-based approaches.

    In the first part of the thesis, we study graph-based representations of text. Turning text into graphs is not trivial and predates the introduction of word embeddings to the NLP community. In our work, we show that graph-based representations of text can effectively capture relationships such as order, semantics, or syntactic structure. Moreover, they can be created quickly while offering great versatility for multiple tasks.

    In the second part, we focus on structured regularization for text. Textual data suffer from the dimensionality problem, creating huge feature spaces. Regularization is critical for any machine learning model, as it can address overfitting. In our work, we present novel approaches for text regularization, introducing new groups of linguistic structures and designing new algorithms.

    In the last part of the thesis, we study new methods to measure distance in the word embedding space. First, we introduce diverse methods to boost comparison between documents that consist of word vectors. Next, representing the comparison of documents as a weighted bipartite matching, we show how we can learn hidden representations and improve results for the text classification task. Finally, we conclude by summarizing the main points of the total contribution and discussing future directions.
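The weighted bipartite matching view of document comparison mentioned in the abstract can be sketched with SciPy's Hungarian-algorithm solver. This is an illustrative sketch under simple assumptions (uniform word weights, toy 2-d vectors), not the thesis implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def matching_distance(A, B):
    """Minimum-cost one-to-one matching between the word vectors
    (rows) of two documents, solved as a bipartite assignment."""
    cost = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    return cost[rows, cols].sum()

# Two toy "documents", each a stack of 2-d word vectors.
doc1 = np.array([[0.0, 0.0], [1.0, 0.0]])
doc2 = np.array([[1.0, 0.1], [0.0, 0.1]])
print(matching_distance(doc1, doc2))  # ≈ 0.2: each word pairs with its near neighbor
```

Unlike the transport formulation behind WMD, the matching constrains each word to pair with exactly one word in the other document, which is what makes learning hidden representations over the matched pairs tractable.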


    Cyber Threats to Industrial IoT: A Survey on Attacks and Countermeasures

    In today's Industrial Internet of Things (IIoT) environment, where different systems interact with the physical world, the state proposed by the Industry 4.0 standards can lead to escalating vulnerabilities, especially when these systems receive data streams from multiple intermediaries, requiring multilevel security approaches in addition to link encryption. At the same time, taking into account the heterogeneity of the systems in the IIoT ecosystem and the non-standardized interoperability in terms of hardware and software, serious issues arise as to how to secure these systems. Given that the protection of industrial equipment is a requirement inextricably linked to technological developments and the use of the IoT, it is important to identify the major vulnerabilities and the associated risks and threats, and to suggest the most appropriate countermeasures. In this context, this study describes attacks against IIoT systems and provides a thorough analysis of the solutions to these attacks proposed in the most recent literature.